
fix(k8s): increase Sentry Clickhouse storage to 50Gi #3170

Merged · 3 commits into main from increase-sentry-clickhouse-storage · Dec 12, 2023

Conversation

@themightychris (Contributor) commented on Dec 7, 2023

Description

A Grafana alert has been firing consistently of late about one of Sentry's storage volumes hovering around 80% utilization. Investigating, I found the culprit was not Zookeeper, which routinely needs old log/snapshot data purged, but Clickhouse's data volume, which stores Sentry's events and is legitimately growing over time. Reducing disk usage there would mean actually deleting some of Sentry's event history. Given the long time period covered by the initially provisioned 30Gi of storage (the upstream default for the Helm chart we use to deploy Sentry) and the relative cost of storage, I suggest it is best to simply expand the storage volume to 50Gi.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

I executed the following command within the kubernetes/apps/charts/sentry directory:

```sh
helm template sentry . -n sentry -f ../../values/sentry_sensitive.yaml -f ./values.yaml --debug
```

This renders the complete manifests offline for our Sentry deployment, and by diffing the before and after output I verified that only the storage request for the sentry-clickhouse StatefulSet was changed:

```diff
<           storage: "30Gi"
---
>           storage: "50Gi"
```
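For reference, that before/after comparison can be reproduced with something like the following (a sketch; stashing the working-tree change is just one of several ways to render the pre-change manifests):

```sh
# From kubernetes/apps/charts/sentry: render the manifests before and after
# the values.yaml change, then diff the two outputs.
git stash                                   # temporarily revert the 50Gi change
helm template sentry . -n sentry \
  -f ../../values/sentry_sensitive.yaml -f ./values.yaml > /tmp/before.yaml
git stash pop                               # restore the 50Gi change
helm template sentry . -n sentry \
  -f ../../values/sentry_sensitive.yaml -f ./values.yaml > /tmp/after.yaml
diff /tmp/before.yaml /tmp/after.yaml       # should show only the 30Gi -> 50Gi lines
```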

Post-merge follow-ups

Ensure that the following 3 PersistentVolumeClaims change from 30Gi to 50Gi (one way to apply this by hand is sketched after the list):

  • sentry-clickhouse-data-sentry-clickhouse-0
  • sentry-clickhouse-data-sentry-clickhouse-1
  • sentry-clickhouse-data-sentry-clickhouse-2
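If the PVCs do not pick up the new size on their own, a minimal sketch of a manual resize, assuming the sentry namespace and a StorageClass with allowVolumeExpansion enabled:

```sh
# Patch each existing PVC directly. A StatefulSet's volumeClaimTemplates are
# immutable, so the Helm change alone does not resize claims that already exist.
for i in 0 1 2; do
  kubectl -n sentry patch pvc "sentry-clickhouse-data-sentry-clickhouse-$i" \
    -p '{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'
done

# Confirm the new capacity has been applied.
kubectl -n sentry get pvc | grep clickhouse
```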

It may additionally be necessary to log into the associated pods or nodes and run resize2fs to expand each filesystem onto the newly available disk space. Afterwards, run df -h within each of the three pods to confirm that the filesystems show roughly 50Gi of capacity and are back well below 80% utilization.
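The df check can also be scripted from outside the pods. A sketch, assuming the data volume is mounted at /var/lib/clickhouse (ClickHouse's usual default; adjust the path if the chart mounts it elsewhere):

```sh
# Report filesystem size and utilization inside each ClickHouse pod.
# Add -c <container> to the exec if the pod runs more than one container.
for i in 0 1 2; do
  kubectl -n sentry exec "sentry-clickhouse-$i" -- df -h /var/lib/clickhouse
done
```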

@evansiroky (Member) left a comment

The logic makes sense, but I don't know enough about kubernetes config to know if the proposed change will work, so I'll trust the process.

@themightychris force-pushed the increase-sentry-clickhouse-storage branch 2 times, most recently from 2aded31 to 1e0848f on December 8, 2023 at 20:37
@SorenSpicknall (Contributor) left a comment

Tentatively approving because my only requested changes are copy changes, and I anticipate you'll be able to make them without requiring a re-review to merge.

3 review comments on runbooks/workflow/disk-space.md (outdated, resolved)
@themightychris force-pushed the increase-sentry-clickhouse-storage branch 2 times, most recently from 24c0e07 to 2e28afb on December 12, 2023 at 16:55
@themightychris merged commit a5c032b into main on Dec 12, 2023
4 checks passed
@themightychris deleted the increase-sentry-clickhouse-storage branch on December 12, 2023 at 16:59